Telco Customer Churn

In this article, we analyze and predict customer churn using the Telco Customer Churn dataset.

Dataset

| Column | Description |
|---|---|
| customerID | Customer ID |
| gender | Whether the customer is male or female (Male, Female) |
| SeniorCitizen | Whether the customer is a senior citizen or not (1, 0) |
| Partner | Whether the customer has a partner or not (Yes, No) |
| Dependents | Whether the customer has dependents or not (Yes, No) |
| tenure | Number of months the customer has stayed with the company |
| PhoneService | Whether the customer has a phone service or not (Yes, No) |
| MultipleLines | Whether the customer has multiple lines or not (Yes, No, No phone service) |
| InternetService | Customer’s internet service provider (DSL, Fiber optic, No) |
| OnlineSecurity | Whether the customer has online security or not (Yes, No, No internet service) |
| OnlineBackup | Whether the customer has an online backup or not (Yes, No, No internet service) |
| DeviceProtection | Whether the customer has device protection or not (Yes, No, No internet service) |
| TechSupport | Whether the customer has tech support or not (Yes, No, No internet service) |
| StreamingTV | Whether the customer has streaming TV or not (Yes, No, No internet service) |
| StreamingMovies | Whether the customer has streaming movies or not (Yes, No, No internet service) |
| Contract | The contract term of the customer (Month-to-month, One year, Two year) |
| PaperlessBilling | Whether the customer has paperless billing or not (Yes, No) |
| PaymentMethod | The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic)) |
| MonthlyCharges | The amount charged to the customer monthly |
| TotalCharges | The total amount charged to the customer |
| Churn | Whether the customer churned or not (Yes, No) |

Preprocessing

Int Columns

Float Columns

Yes/No Columns

First, let's convert all Yes/No columns as follows:

$$\begin{cases} 0 &\mbox{No}\\ 1 &\mbox{Yes}\end{cases}$$
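
A minimal sketch of this mapping in pandas is shown here. The toy frame and the variable names `yes_no_columns` and `yes_no_map` are assumptions for illustration; only the column names come from the dataset description.

```python
import pandas as pd

# Toy rows standing in for the real data (illustrative values only).
df = pd.DataFrame({
    "Partner": ["Yes", "No", "Yes"],
    "Dependents": ["No", "No", "Yes"],
    "PhoneService": ["Yes", "Yes", "No"],
    "PaperlessBilling": ["No", "Yes", "Yes"],
    "Churn": ["No", "Yes", "No"],
})

# Columns whose only values are Yes and No.
yes_no_columns = ["Partner", "Dependents", "PhoneService", "PaperlessBilling", "Churn"]
yes_no_map = {"No": 0, "Yes": 1}

for col in yes_no_columns:
    # Series.map yields NaN for any label outside the mapping,
    # which makes unexpected values easy to spot afterwards.
    df[col] = df[col].map(yes_no_map)
```

Using an explicit dictionary rather than a boolean comparison keeps unexpected labels visible as missing values instead of silently coding them as 0.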

Some other columns can be converted similarly, but we first need to create a new feature.

Note that InternetService takes three values (DSL, Fiber optic, No), so this column can be coded as follows:

$$\mbox{InternetServiceType} = \begin{cases} 0 &\mbox{No} \\ 1 &\mbox{DSL}\\ 2 &\mbox{Fiber optic}\end{cases}$$
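
As a sketch, the new feature could be derived from InternetService with a dictionary lookup. The toy frame and the name `internet_map` are assumptions; the codes follow the formula above.

```python
import pandas as pd

# Toy rows standing in for the real data (illustrative values only).
df = pd.DataFrame({"InternetService": ["DSL", "Fiber optic", "No", "DSL"]})

# Codes taken from the InternetServiceType definition.
internet_map = {"No": 0, "DSL": 1, "Fiber optic": 2}

# Create the new feature and keep the original column untouched.
df["InternetServiceType"] = df["InternetService"].map(internet_map)
```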

Since the "No internet service" case is already captured by InternetService, we can code the remaining internet-related columns as

$$\begin{cases} 0 &\mbox{No, No internet service}\\ 1 &\mbox{Yes}\end{cases}$$

Since there is already a PhoneService feature, we can code MultipleLines in the same way:

$$ \mbox{MultipleLines} = \begin{cases} 0 &\mbox{No, No phone service}\\ 1 &\mbox{Yes}\end{cases} $$
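
The two collapsed codings can be sketched with one shared mapping. The toy frame shows only three of the affected columns, and the name `collapse_map` is an assumption for illustration.

```python
import pandas as pd

# Toy rows standing in for the real data (illustrative values only).
df = pd.DataFrame({
    "OnlineSecurity": ["Yes", "No", "No internet service"],
    "TechSupport": ["No internet service", "Yes", "No"],
    "MultipleLines": ["No phone service", "No", "Yes"],
})

# "No internet service" and "No phone service" are folded into 0 because
# InternetService and PhoneService already carry that information.
collapse_map = {
    "No": 0,
    "No internet service": 0,
    "No phone service": 0,
    "Yes": 1,
}

for col in ["OnlineSecurity", "TechSupport", "MultipleLines"]:
    df[col] = df[col].map(collapse_map)
```

On the full data, the same loop would presumably also cover OnlineBackup, DeviceProtection, StreamingTV, and StreamingMovies.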

Other Columns

Contract

$$ \mbox{Contract} = \begin{cases} 0 &\mbox{Month-to-month}\\ 1 &\mbox{One year}\\ 2 &\mbox{Two year} \end{cases} $$

Gender

$$ \mbox{Gender} = \begin{cases} 0 &\mbox{Female}\\ 1 &\mbox{Male}\end{cases} $$
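
A short sketch of the Contract and Gender codings follows. The toy frame and the mapping names are assumptions; the codes match the two formulas above.

```python
import pandas as pd

# Toy rows standing in for the real data (illustrative values only).
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Two year", "One year"],
    "gender": ["Female", "Male", "Female"],
})

# Contract terms have a natural order, so an ordinal code is reasonable.
contract_map = {"Month-to-month": 0, "One year": 1, "Two year": 2}
gender_map = {"Female": 0, "Male": 1}

df["Contract"] = df["Contract"].map(contract_map)
df["gender"] = df["gender"].map(gender_map)
```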

PaymentMethod

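In this case, the values have no natural order, so we cannot rank them. Therefore, an ordinal code like the ones above is not appropriate for this column.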
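
One common way to handle an unordered categorical column is one-hot encoding. Whether the original analysis used it is an assumption; the sketch below uses `pd.get_dummies` on a toy frame.

```python
import pandas as pd

# Toy rows covering the four payment methods (illustrative only).
df = pd.DataFrame({
    "PaymentMethod": [
        "Electronic check",
        "Mailed check",
        "Bank transfer (automatic)",
        "Credit card (automatic)",
    ]
})

# One indicator column per payment method; no ordering is implied.
dummies = pd.get_dummies(df["PaymentMethod"], prefix="PaymentMethod", dtype=int)

# Replace the original text column with the indicator columns.
df = pd.concat([df.drop(columns="PaymentMethod"), dummies], axis=1)
```

Each row then has exactly one indicator set to 1, so no payment method is treated as larger or smaller than another.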

Imputing Missing Values
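
As a sketch only: the text does not say which column has missing values or how they are filled. The example below assumes blanks in TotalCharges and fills them with the column median; both choices are assumptions.

```python
import pandas as pd

# Toy rows; the blank TotalCharges entry is an assumed form of missing value.
df = pd.DataFrame({
    "tenure": [0, 12, 24],
    "TotalCharges": [" ", "350.5", "1200.0"],
})

# Convert text to numbers; entries that cannot be parsed become NaN.
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")

# Fill the gaps with the median of the observed values (an assumed strategy).
df["TotalCharges"] = df["TotalCharges"].fillna(df["TotalCharges"].median())
```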

Data Correlations

Let's take a look at the variance of the features.
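
A minimal sketch of such a check, assuming the features are already numeric; the toy values are illustrative only.

```python
import pandas as pd

# Toy numeric features (illustrative values only).
df = pd.DataFrame({
    "tenure": [1, 34, 2, 45],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30],
    "SeniorCitizen": [0, 0, 1, 0],
})

# Sample variance of every column, largest first.
variances = df.var().sort_values(ascending=False)
print(variances)
```

Features on larger scales, such as tenure in months, naturally show larger variances than 0/1 indicators, so the values are best compared with the scales in mind.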

Next, we look at the correlations of the features with customer Churn.
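
These correlations could be computed with `DataFrame.corrwith`, as sketched here. The toy values and the name `churn_corr` are assumptions for illustration, not results from the actual data.

```python
import pandas as pd

# Toy encoded data (illustrative values only, not real results).
df = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 8, 22],
    "MonthlyCharges": [70.0, 56.0, 80.0, 42.0, 99.0, 50.0],
    "Churn": [1, 0, 1, 0, 1, 0],
})

# Pearson correlation of every feature with the Churn column.
churn_corr = df.drop(columns="Churn").corrwith(df["Churn"]).sort_values()
print(churn_corr)
```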


Saving